Heterogeneous acceleration of volumetric JPEG 2000 using OpenCL

Authors

  • Jan G. Cornelis
  • Jan Lemeire
  • Tim Bruylants
  • Peter Schelkens
Abstract

This paper discusses an OpenCL version of a volumetric JPEG 2000 codec that runs on GPUs, multi-core processors, or a combination of both. Since the performance-critical part consists of a fine-grained algorithm (the discrete wavelet transform) and a coarse-grained algorithm (Tier-1), the best performance is obtained with a hybrid execution in which the discrete wavelet transform is executed on a GPU and Tier-1 on a multi-core processor. Using an Intel i7 multi-core processor in combination with a modest NVIDIA Quadro K620 GPU yields speedups greater than 10 compared with the original sequential code. The paper discusses the performance bottlenecks that arise on GPUs when parallelizing algorithms that are coarse-grained by nature, together with the optimizations that are possible. A performance analysis reveals the inefficiencies and explains the deviations from the GPU peak performance.
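
The fine-grained nature of the discrete wavelet transform can be illustrated with a small sketch. The Python code below is not the authors' OpenCL code. It is a one-level, one-dimensional reversible 5/3 lifting transform, the integer filter defined in JPEG 2000 Part 1; that the codec described here uses this filter and this boundary handling is an assumption, and the function names `dwt53_forward` and `dwt53_inverse` are hypothetical.

```python
def dwt53_forward(x):
    """One-level 1-D reversible 5/3 lifting transform (even-length input).

    Illustrative sketch only; names and structure are assumptions,
    not taken from the paper.
    """
    n = len(x)
    if n < 2 or n % 2 != 0:
        raise ValueError("expected an even-length signal with at least 2 samples")
    even, odd = x[0::2], x[1::2]
    half = n // 2

    # Predict step: every detail coefficient depends only on its two
    # even neighbours, so all iterations are independent of each other.
    d = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]  # symmetric extension
        d.append(odd[i] - ((even[i] + right) >> 1))

    # Update step: every smooth coefficient depends only on two details,
    # so these iterations are independent as well.
    s = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[i]  # symmetric extension
        s.append(even[i] + ((left + d[i] + 2) >> 2))
    return s, d


def dwt53_inverse(s, d):
    """Undo dwt53_forward exactly (integer arithmetic, no rounding loss)."""
    half = len(s)
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[i]
        even.append(s[i] - ((left + d[i] + 2) >> 2))
    odd = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        odd.append(d[i] + ((even[i] + right) >> 1))
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x


signal = [10, 12, 14, 13, 9, 7, 8, 8]
low, high = dwt53_forward(signal)
restored = dwt53_inverse(low, high)
```

Within each lifting step no iteration reads a value written by another iteration of the same step, which is the kind of per-sample independence that maps naturally onto one GPU work-item per coefficient. Tier-1, by contrast, codes each code-block with a sequential, context-adaptive arithmetic coder, so its parallelism is available mainly across code-blocks rather than within them.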

Similar resources

Evaluation of ‘OpenCL for FPGA’ for Data Acquisition and Acceleration in High Energy Physics

The increase in the data acquisition and processing needs of High Energy Physics experiments has made it more essential to use FPGAs to meet those needs. However, harnessing the capabilities of FPGAs has been hard for anyone but expert FPGA developers. The arrival of OpenCL, with the two major FPGA vendors supporting it, offers an easy software-based approach to taking advantage of FPGAs in a...

The Support of an Experimental OpenCL Compiler on HSA Environments

In recent years, with the increasing computing power and programmability of GPUs, the GPU has taken on an important role as a hardware accelerator. Heterogeneous System Architecture (HSA), announced by the HSA Foundation, is an approach that benefits from the advantages of both CPUs and GPUs. Open Computing Language (OpenCL) is one of the well-known programming frameworks for parallel computing on heterogeneous architectures....

Energy-efficient FPGA Implementation of the k-Nearest Neighbors Algorithm Using OpenCL

Modern SoCs are getting increasingly heterogeneous, with a combination of multi-core architectures and hardware accelerators to speed up the execution of compute-intensive tasks at considerably lower power consumption. Modern FPGAs, due to their reasonable execution speed and comparatively lower power consumption, are strong competitors to the traditional GPU-based accelerators. High-level Synthe...

Wavelet based volumetric medical image compression

The amount of image data generated each day in health care is ever increasing, especially in combination with the improved scanning resolutions and the importance of volumetric image data sets. Handling these images raises the requirement for efficient compression, archival and transmission techniques. Currently, JPEG 2000's core coding system, defined in Part 1, is the default choice for medic...

Lossless Volumetric Medical Image Compression with Progressive Multi-planar Reformatting Using 3-D DPCM

In this paper, we propose a novel lossless volumetric medical image compression scheme using three-dimensional differential pulse code modulation (3-D DPCM), which provides an efficient procedure to achieve progressive multi-planar reformatting (MPR) of large 3-D medical data sets. Being separable and commutative in the order of its application, 3-D DPCM provides an opportunity to generate MPR ...

Journal:
  • IJHPCA

Volume 31

Publication date: 2017